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Abstract 

A prototype flash analog-to-digital readout system for cosmic ray detection at en- 
ergies below 10 18 eV has been designed and tested at Columbia University Nevis 
Laboratories. The electronics consist of a FADC module that digitizes 16 photo- 
multipliers at 40 MHz with 14-bit dynamic range. The module is read out to a PC 
(running Linux) through a PCI interface. Taking advantage of the large bandwidth 
provided by the PCI bus, we have implemented a software-based data acquisition 
system. This note describes the software and electronics, as well as preliminary tests 
carried out using a prototype FADC module. 
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1 Introduction 



In the past few years, several proposals have been made to extend the ob- 
servation of ultra high-energy cosmic rays (UHECRs) to energies well below 
10 18 eV [1,2], the threshold of currently operated UHECR detectors. Previous 
measurements have indicated that at these energies, the cosmic ray energy 
spectrum, a steeply falling power law in energy E~ a , with spectral index 
a ~ 3, shows structure, and the chemical composition undergoes a change 
from a heavier, iron-dominated mixture to a lighter, proton-dominated com- 
position [3]. 
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Changes in composition and in the index of the energy spectrum could in- 
dicate a transition in the sources of cosmic rays, from a Galactic origin at 
lower energies to an extragalactic origin at higher energies. The transition re- 
gion, between the "knee" of the spectrum — a steepening of the index around 
10 15 eV [2] — and the start of the ultra high-energy regime above 10 18 eV, has 
not been observed in detail. The basic requirements for a detector measuring 
cosmic rays at such energies are excellent energy resolution, good angular res- 
olution, and the ability to discriminate between different cosmic ray primaries, 
at least on a statistical basis. 

A measurement technique that meets these requirements is the air fluorescence 
method, in which a detector observes the nitrogen fluorescence light generated 
by secondary air shower particles in the atmosphere. Most of the fluorescence 
light is emitted at wavelengths between 300 nm and 450 nm, requiring a UV- 
sensitive instrument. Typically, an air fluorescence detector is composed of a 
large number of optically fast "cameras," each collecting shower light with a 
wide-angle mirror, and using several hundred photomultipliers (PMTs) with 
fast electronics to image the developing shower in real time. 

The fluorescence signal is weak compared to other light sources, such as moon- 
light, but can be observed over a dark sky background. To maximize signal 
to noise, the camera mirrors tend to be large, on the order of a few square 
meters, and the PMTs observe a relatively small portion of the sky (usually 
~ 1° x 1°). Currently operating cosmic ray detectors that employ the air flu- 
orescence technique are the High Resolution Fly's Eye (HiRes) experiment in 
Utah [4], and the Pierre Auger Observatory in Argentina [5]. 

Since air fluorescence detectors observe extensive air showers as they develop 
in the atmosphere, the technique yields not only accurate measurements of 
shower geometry, but also calorimetric estimates of the primary particle en- 
ergy. However, the measurements are also plagued by several major technical 
difficulties. The fluorescence signal strength from cosmic ray air showers varies 
greatly due to shower-to-shower fluctuations. In HiRes, for example, the typi- 
cal shower signal ranges between 200 photoelectrons to several thousand in a 
100 ns time window [4]. Moreover, the signal sits atop a slowly varying dark 
sky background — on the order of tens of photoelectrons per ^s per PMT 
- that can increase by an order of magnitude when a bright star crosses the 
PMT border [4]. Finally, while the duration of each shower is short, of order 
/is, event rates are high, on the order of kHz. 

The major challenges posed by extending the air fluorescence technique below 
10 18 eV include the large dynamic range of the fluorescence and background 
signals, and the large increase in rate at lower energies. To accommodate 
such observations, we have designed and partially implemented a fully digi- 
tal readout system for an air fluorescence telescope. In its current form, the 
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readout system, which has been tested at Columbia University Nevis Lab- 
oratories, contains three basic components: a subcluster of sixteen Photonis 
XP3062 photomultipliers; an "FADC module" responsible for digitizing the 
PMT outputs and making basic trigger decisions; and a compact PCI board 
that handles two-way communication between the FADC electronics and a 
data acquisition (DAQ) PC running Linux. 

In this paper, we discuss the components of the electronics system in detail, 
and describe the results of basic calibrations. The paper is organized as follows. 
Sections 2.1 and 2.2 give a brief description of the design consideration for the 
development of a DAQ system for air fluorescence detectors at lower energies. 
We then describe the FADC module (Section 2.3), the PCI readout board 
(Section 2.4), and the data acquisition system (Section 2.5). Some benchmarks 
are discussed and summarized in Section 3. 



2 Description of the System 

2.1 Design Considerations 

In the design of the readout electronics, we attempted to follow three guiding 
principles. First, so as to limit analog noise and ease the signal processing re- 
quirements, the electronics are set up to digitize the PMT outputs immediately 
after integration and shaping by two preamplifiers. All subsequent monitor- 
ing and triggering tasks are performed on the digitized waveforms. Second, 
the hardware controller does not require a cumbersome VME interface; ac- 
cess occurs through a custom designed PCI card using simple function calls. 
Third, the system offloads most trigger decisions to the DAQ software, taking 
advantage of the speed of the DAQ PC. This greatly simplifies the overall 
requirements of the electronics and guarantees maximum flexibility in the im- 
plementation of various trigger schemes. The large bandwidth of the PCI bus 
(130 MB s^ 1 nominal) easily accommodates the large flow of data from the 
FADC module. 



2.2 System Overview 

A basic outline of the PMT readout chain is as follows. The PMT bases 
contain a simple board to supply HV and read out the anode currents. After 
the anode output is integrated by a preamplifier, a high bandwidth FireWire 
cable transfers the signal to a 14-bit FADC, where it is digitized at 40 MHz. 
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Fig. 1. Logical setup of an FADC channel. In the trigger path, the FIR integration 
time, external baseline, threshold, counter level, and n-fold parameter are set by 
software knobs. 



Each FADC is located on a board with sixteen channels in total (i.e., one 
board reads out sixteen PMTs.) Monitoring the sixteen channels is an Altera 
FPGA, which tracks the channel baselines with a simple baseline finder, and 
implements a simple threshold and coincidence trigger scheme for the PMT 
"mini-cluster" it controls. The FPGA firmware has also been programmed as 
an event builder, packaging the sixteen tube signals into a simple data format 
- word count, time stamp, and channel outputs - for later processing. 

In our mini-cluster, the FPGA-FADC board cannot alter the PMT gains. 
This approach is taken because the large dynamic range of the FADCs allows 
us to avoid using an overflow channel for saturated tubes. Hence, we do not 
normalize the inputs to the board, aside from some course gain adjustments 
at the tube level. 

Once the FPGA packages the FADC signals into a data block, the block moves 
over another high-bandwidth connection to a PCI card, which provides the 
interface to hard disk storage. The bandwidth of the PCI bus allows the card 
to rapidly and transparently move events from the detector directly to PC 
memory, where they are saved in a large buffer. Once in memory, the data 
can be read from the buffer, whereupon higher-level triggers may be applied 
to the PMT cluster output. 

In the following sections, we give a more detailed description of each of the 
components of the system. 



2.3 FADC Module 



The FADC electronics are composed of two PCBs: a single Digital Signal 
Processing (DSP) board that accepts differential analog signals from sixteen 
phototubes, and a backplane that receives i/o and clock input from a PC via 
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Fig. 2. Effect of the channel counter, which doubles the length of time the 
above-threshold signal can be used for coincidence triggers. 

the PCI interface. Analog data from the sixteen phototubes are transported 
to the DSP board by two high-bandwidth Mini D ribbon cables, while the 
DSP and backplane communicate via three MZP board-to-board PCB plugs. 

After arriving at the DSP board, the integrated and shaped analog output 
from each tube in the PMT subcluster is processed by two high-speed differ- 
ential amplifiers and digitized by a 40 MHz, 14 bit, 300 mW flash analog-to- 
digital converter (FADC). The particular converter used in the FADC module 
(Analog Devices AD9244) was selected for its good balance between high dig- 
itization rate and low power consumption [6]. 

Following digitization by the sixteen FADCs, each input channel splits the 
digitized data along two paths, as shown in Fig. 1: a trigger path for signal 
processing by an Altera Stratix FPGA [7]; and a deep memory path to store 
the data while the trigger decision occurs. The memory path is implemented 
by a lkxl4b FIFO, which is sufficient to store the data for 25 /is during the 
trigger stage. 

In the trigger path, the digitized photomultiplier waveforms are integrated 
over one, two, four, or eight successive samples by a FIR filter implemented in 
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the Stratix FPGA. The integration time of the filter is adjustable by a DAQ 
software knob. The integrated output is compared to the sum of user-supplied 
external baseline and threshold parameters, and then is further discriminated 
by a counter. The external channel baseline, trigger threshold, and counter 
discriminator window are unique to each channel. 

At each ADC clock, the counter increments when the PMT signal is above 
threshold, and decrements when the signal drops below threshold. A discrim- 
inator monitors the counter level, and when it exceeds a tunable base value 
for a tunable number of clock cycles, the discriminator will enable the trigger 
channel. Hence, the counter can effectively double the length of time available 
for time coincidence between channels when compared to using the signal 
alone (see Fig. 2). 

Note that the counter discriminator, used in this manner, biases air shower 
detection toward low-energy events, which tend to occur relatively close to the 
detector and give rise to a strong light signal at the camera, and against high- 
energy events, which tend to occur farther away, have a larger spread in signal 
times, and have relatively low light levels. The bias toward close showers not 
only reduces the overall event rate, but also decreases the uncertainty caused 
by light transmission from distant showers through very long paths in the 
atmosphere. 

Once the trigger channel is enabled, data stored in the memory paths are 
packed into 18-bit words, built into an event, and moved to the PC. Note that 
such a cluster readout can be enabled in two ways. First, if some number of 
above-threshold channels are triggered in time coincidence, the stored data 
from the entire cluster will read out to the PC. The number of channels re- 
quired for coincidence is set by an n-fold parameter in software. Second, the 
DAQ software may make an explicit data request — as is often useful during 
testing and baseline monitoring — at which point all channels will read out 
for an adjustable length of time, independent of their trigger states. 

As mentioned earlier, the channel baselines, thresholds, discriminator settings, 
and the time coincidence number may be set dynamically from the host PC. 
In this way, the DAQ software can monitor each channel offline and adjust the 
trigger parameters for the subcluster to raise or lower trigger rates depending 
on drifts in background light levels. However, the firmware also monitors the 
signals online, tracking an internal baseline and variance for each channel by 
averaging the FADC outputs over 256 clock cycles. If the user chooses, this 
fast internal baseline can be used for trigger decisions rather than the slower 
external baseline. 
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Fig. 3. Architecture of the Xilinx Spartan FPGA aboard the PCI card. 



24 PCI Readout Board 



When a trigger occurs in the FADC module, the data are sent to the DAQ host 
computer, an Intel PC running Linux. The host communicates with the FADC 
module through a compact PCI board, a 32 bit, 33 MHz PCI accelerator. 
The PCI board is driven by a PCI 9056 chip from PLX [8], and has three 
connections to the FADC module backplane: two high speed Fiber Channel 
cables for control and data, and a USB 2.0 link for clock. 

The PLX PCI 9056 chip has several very convenient features suitable for 
our application. First, it implements a DMA engine for direct memory access 
transfers into the host memory, freeing CPU resources for DAQ functions and 
disk i/o. The card actually contains two DMA channels, which we use for the 
transfer of control and data. Second, the PLX drivers allow the chip to operate 
in so-called direct slave C-Mode [8], in which locations in the PCI address space 
are mapped into host memory. This allows the DAQ host direct read and 
write access to the PCI 9056 registers and addresses, greatly simplifying the 
controller software and significantly reducing the overhead of DMA transfer 
set ups. 

The second major component of the PCI card is a Xilinx Spartan-3 FPGA [9], 
which is responsible for managing control requests sent to the FADC module 
and data heading to the host PC. To handle control and data, the Spartan de- 
vice implements four FIFOs in firmware: two for control transfers (control_in 
and control_out) and two for data transfers (data_in and data_ptr), as 
shown in Fig. 3. The control FIFOs are responsible for setting and receiv- 
ing the FADC status. The data_in FIFO contains data events sent from the 
FADCs and written to a DMA buffer, while data_ptr stores a list of buffer 
addresses marking the start of each event. Finally, there is an additional FIFO, 
data_out, that is responsible for blocking transfers from the FADC module 
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Fig. 4. Physical layout of a DMA event buffer. 



when the data FIFOs are full. 

When the PLX card operates in direct slave mode, the address spaces of the 
Spartan are also mapped into PC memory. Hence, the DAQ host can operate 
the FADC module transparently and efficiently. Module commands are sent 
via writes to the control FIFO status registers; the DMA engine is initialized 
via direct writes to the PCI 9056 registers; and events are directly read from 
the data FIFO addresses. 

The PLX card is nominally capable of very large data transfers, up to 8 MB 
in a single transfer at a rate of 130 MB s _1 . In benchmarking tests conducted 
on a Windows PC at Nevis, we have observed sustained transfer rates of 
~ 80 MB s -1 . 



2.5 Software DAQ 



The operation of the card and data acquisition system in the PC is fairly 
straightforward. Its primary responsibilities are to initialize the FADC module 
for data-taking; prepare the PCI card for DMA transfers; allocate sufficient 
memory to store events from the FADCs; perform software-level triggers on 
incoming events; and write passed events to disk. 

The DAQ host is an Intel PC running Linux kernel 2.4, chosen for compatibil- 
ity with the PLX PCI device driver. At program startup, the DAQ software 
must allocate large, contiguous blocks of physical RAM — in sections of up 
to 8 MB of main memory — for DMA transfers. Each DMA buffer, shown in 
Fig. 5, contains a list of event pointers and actual events. An event pointer 
consists of an address word and an event size word, which are determined 
during processing in the FADC module and Spartan-3 FPGA. The events 
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Fig. 5. Logical setup of the software DMA ring buffer, set up for asynchronous 
readout. 

themselves contain an event type, a time base, a list of baseline averages and 
variances for the entire cluster, and the actual FADC traces. The number of 
trace samples in each event, which determines the event size, is set by the 
control software. 

The large volume of data moving over the PCI bus into main memory requires 
the host to allocate large blocks of contiguous physical memory (the PLX 
DMA engine can move up to 8 MB in one DMA transfer). To overcome the 
memory allocation constraints set by the operating system, we patched the 
Linux kernel with a video module that allows users to reserve hundreds of 
MB of physical RAM at boot time [10]. A small alteration to the PLX Linux 
driver makes this memory available at run time to the PLX API. 

Since the memory allocated for DMA transfers must be safely handled by the 
DAQ process, the DAQ software wraps the PLX memory allocation functions 
inside a C++ DMA buffer class. The class constructors and destructors auto- 
matically allocate and deallocate blocks of reserved memory in a transparent 
manner, safely returning the memory to the operating system even after pro- 
gram failure. 

The abstraction of memory regions allows the software to easily handle a 
second important requirement of the DAQ: namely, the need to store new 
data as older events are still being processed. For this purpose, the boot-time 
memory region is divided into DMA buffer blocks, and a reference to each 
buffer is stored in a linked list, creating a "ring buffer" in software (Fig. 5). 

In our implementation, we have divided the contiguous memory space of about 
100 MB into blocks of 4 MB to 8 MB, matching the maximum DMA trans- 
fer size of the PCI interface and containing about 3 to 6 ms of FADC data. 
These blocks are linked into a ring structure, and the DAQ operates by iterat- 
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ing through the ring, analyzing the data block-by-block. Rather than making 
synchronous read and write requests for each buffer, the software splits reading 
and writing into two threads. The first thread starts and stops DMA writes 
into each buffer, while the second iterates through the ring and reads data 
after they have been written. Buffer writing is quite efficient, as the memory 
mapping feature of the PCI interface allows the software to start and stop 
DMA transfers simply by clearing or setting four PCI 9056 registers. More- 
over, such a data acquisition scheme automatically prevents runtime errors 
like writing into unread buffers and reading from unwritten buffers, as each 
thread blocks access to the particular buffer it is using. 

When a buffer is read, the software must rapidly make a trigger decision — for 
instance, further threshold calculations, timing cuts, or geometrical triggers 
on phototubes. If the data survive the cuts, they are saved to a RAID-1 disk 
and the buffer lock is removed. 



3 Discussion 

Using the partial readout system of sixteen photomultipliers, the prototype 
FADC module, the PCI board, and a host PC, we have carried out several 
simple light calibration tests at Nevis. Placing the PMTs in a dark box and 
pulsing them with a blue LED (attenuated with neutral density filters), we 
observed the single-electron response of the subcluster. 

At the typical operating gain of the phototubes (5 x 10 4 ) we found that one 
photoelectron corresponds to approximately one ADC count. This suggests 
that the dynamic range of the FADCs is sufficient to view showers in the de- 
sired energy range without the need for additional low-gain overflow channels. 
Changes in the background light level can be accounted for by dropping and 
raising the threshold and discriminator constraints in the DAQ software. 

Within the DAQ software itself, we have implemented further simple threshold 
triggers to analyze the data, but we have yet to build a geometrical trigger 
for a full camera (sixteen pixels is not sufficient). To use this readout in a 
full telescope will require an intermediate readout board to collect data from 
sixteen or more FADC modules. 
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